FIELDimageR pipeline:
Agriculture and plant breeding image analysis on R environment
Introduction
Agriculture and plant breeding
Images can be used in plant phenotyping to draw inference about many traits:
- Geometric traits (i.e. plant height, leaf area index, lodging, crop canopy cover)
- Canopy spectral texture (spectral features)
- Physiological traits (i.e., chlorophyll, biomass, pigment content, photosynthesis)
- Abiotic/biotic stress indicators (i.e., stomatal conductance, canopy temperature difference, leaf water potential, senescence index)
- Nutrients (nitrogen concentration, protein content)
- Yield
FIELDimageR
FIELDimageR is a R package to analyze images and plant phenotyping, and allows to:
- Crop the image
- Remove soil effect
- Build vegetation indices
- Rotate the image
- Build the plot shapefile
- Extract information for each plot
- Evaluate stand count, canopy percentage, and plant height
FIELDimageR pipeline
1) Example 01
Remote sensing (Potato Breeding)
Jeffrey Endelman and Filipe Matias (UW-Madison)
Required packages
## Installing
# install.packages("sp")
# install.packages("raster")
# install.packages("rgdal")
# install.packages("ggplot2")
# install.packages("agricolae")
# install.packages("reshape2")
# install.packages("devtools")
# install.packages("lme4")
# install.packages("plyr")
# install.packages("DescTools")
# devtools::install_github("filipematias23/FIELDimageR")
## Necessary packages
library(FIELDimageR)
library(raster)
library(agricolae)
library(reshape2)
library(ggplot2)
library(lme4)
library(plyr)
library(DescTools)Data Info (R/FIELDimageR)
- Potato breeding: 196 plots (3x15 feet)
- Population: 138 genotypes
- HTP Platform (Dr. Philip Townsend UM-Madison): Cessna-180 airplane + HySpex VNIR-1800 + HySpex SWIR-384
- Hyperspectral Data: 474 bands
Donwload Data:
- Hyperspectral Image ‘EX_HYP.tif’
- Wavelengths Names ‘namesHYP.csv’
## Uploading hyperspectral file with 474 bands (EX_HYP.tif):
EX.HYP<-stack("EX_HYP.tif")
## Wavelengths (namesHYP.csv):
NamesHYP<-as.character(read.csv("namesHYP.csv")$NameHYP)
names(EX.HYP)<-NamesHYP
head(NamesHYP)Building RGB image from the hyperspectral:
# RGB from hyperspectral:
R<-EX.HYP$X654.788879 # 654nm (Red)
G<-EX.HYP$X552.598572 # 552nm (Green)
B<-EX.HYP$X450.408295 # 450nm (Blue)
RGB<-stack(c(R,G,B))
plotRGB(RGB, stretch="lin")Making the mask by using the RGB to calculate vegetation indices and to remove the soil
## Removing soil using RGB (index NGRDI):
RGB.S<-fieldMask(RGB,index="NGRDI",cropValue = 0.0, cropAbove = F)Building the plot shapefile
- Dataset - Field notes - phenotype (Download: ‘DataHYP.csv’)
## Data frame with field information to make the Map:
Data<-as.data.frame(read.csv("DataHYP.csv"))
head(Data)- EX.HYP field design (MAP) should start from top to bottom and right to left. In other words, plot 1 should be where the blue arrow is showing in the image below.
Map<-fieldMap(fieldPlot = as.character(Data$Plot),fieldRow = as.character(Data$Range),fieldColumn = as.character(Data$Row),decreasing = T)
Map## Building plot shapefile using RGB as base ( 14 columns and 14 rows):
plotFile<-fieldShape(RGB.S,ncols = 14, nrows = 14,
fieldMap = Map,fieldData = Data, ID = "Plot")Removing soil from the hyperspectral using the mask from the RGB (Time: 5-10 min). Download the final result: ‘EX_HYP_S.tif’
## Removing soil from the hyperspectral:
# EX.HYP.S<-fieldMask(EX.HYP,mask = RGB.S$mask, plot = F)
# writeRaster(EX.HYP.S$newMosaic, filename="EX_HYP_S.tif", options="INTERLEAVE=BAND", overwrite=TRUE)
EX.HYP.S<-stack("EX_HYP_S.tif")
names(EX.HYP.S)<-NamesHYP
plot(EX.HYP.S$X750.592224, col=grey(100:1/100), axes=FALSE, box=FALSE)
plot(plotFile$fieldShape, add=T)
## Reducing resolution to accelerate data extraction por plot (e.g., average values):
EX.HYP.S<-aggregate(EX.HYP.S,2)
plot(EX.HYP.S$X750.592224, col=grey(100:1/100), axes=FALSE, box=FALSE)
plot(plotFile$fieldShape, add=T)Extracting data from 474 bands (Time: 3-10 min).
## Extracting data (474 bands):
EX.HYP.I<-fieldInfo(EX.HYP.S, # EX.HYP.S$newMosaic,
fieldShape = plotFile$fieldShape,
n.core = 1)
## Saving the new csv with hyperspectral information per plot:
DataHYP<-EX.HYP.I$fieldShape@data
colnames(DataHYP)<-c(colnames(DataHYP)[1:9],NamesHYP)
DataHYP[1:5,1:12]
# write.csv(DataHYP,"DataHypNew.csv",col.names = T,row.names = F)Making graphics to visualize the data:
## Visualizing the extracted data from FIELDimageR:
dev.off()
DataHYP1<-EX.HYP.I$plotValue[,-1]
## Plotting the data:
plot(x=as.numeric(NamesHYP),y=as.numeric(DataHYP1[1,]),type = "l",xlab = "Wavelength (nm)",ylab = "Reflectance",
col="black",lwd=2,cex.lab=1.2)
for(i in 2:dim(DataHYP1)[1]){
lines(x=as.numeric(NamesHYP),y=as.numeric(DataHYP1[i,]),type = "l",col=i,lwd=2)
}
legend("topright",
c("Blue (400-480nm)",
"Green (480-600nm)",
"Red (600-680nm)",
"RedEdge (680-750nm)",
"NIR (750-1300nm)",
"SWIR-1 (1500-1800nm)",
"SWIR-2 (2000-2400nm)"),
col =c("black"),box.lty=0,cex = 0.8)“Best practices” to use hyperspectral data:
- Removing atmospheric absorption regions
- Vector normalization to reduce the effects of illumination conditions (Feilhauer et al., 2010)
## Removing atmospheric absorption regions:
DataHYP2<-DataHYP[,-c(1:9)]
Wavelenght<-as.numeric(colnames(DataHYP2))
DataHYP2<-DataHYP2[,!(Wavelenght>1290&Wavelenght<1510|
Wavelenght>1790&Wavelenght<2040|
Wavelenght>2350&Wavelenght<2550)]
## Vector normalization to reduce the effects of illumination conditions:
DataHYP2<-as.matrix(t(apply(DataHYP2, 1, function(x){
x/sqrt(sum(as.numeric(x)^2))
})))
## Plotting the new data:
plot(x=as.numeric(colnames(DataHYP2)),y=as.numeric(DataHYP2[1,]),type = "p",
xlab = "Wavelength (nm)",ylab = "Reflectance",
col="black",pch=19,cex.lab=1.2,
cex=0.5)
for(i1 in 2:dim(DataHYP2)[1]){
lines(x=as.numeric(colnames(DataHYP2)),y=as.numeric(DataHYP2[i1,]),type = "p",col=i1,pch=19,cex=0.5)
}
legend("topright",
c("Blue (400-480nm)",
"Green (480-600nm)",
"Red (600-680nm)",
"RedEdge (680-750nm)",
"NIR (750-1300nm)",
"SWIR-1 (1500-1800nm)",
"SWIR-2 (2000-2400nm)"),
col =c("black"),box.lty=0,cex = 0.8)
## Saving the new csv table:
DataHYP3<-cbind(DataHYP[,c(1:9)],DataHYP2)
# write.csv(DataHYP3,"DataHypNew2.csv",col.names = T,row.names = F)Correlation (r) between wavelengths and field manual collected data (e.g., yield, plant height, etc.)
## Scale the wavelengths values to correct for any unit effects:
DataHYP4<-scale(DataHYP2,scale = T)
## Correlation between wavelengths and Trait_1:
r.Hyp<-NULL
for(r in 1:dim(DataHYP4)[2]){
r.Hyp<-c(r.Hyp,cor(DataHYP4[,r],scale(DataHYP$Trait_1,scale = T),use = "pairwise.complete.obs"))
}
## Plotting correlation values:
plot(x=as.numeric(colnames(DataHYP4)),y=r.Hyp,type = "o",
xlab = "Wavelength (nm)",
ylab = "r",
col="black",pch=19,cex.lab=1.2
)
abline(h=0,col="red",lty=2,lwd=3)
legend("topright",
c("Blue (400-480nm)",
"Green (480-600nm)",
"Red (600-680nm)",
"RedEdge (680-750nm)",
"NIR (750-1300nm)",
"SWIR-1 (1500-1800nm)",
"SWIR-2 (2000-2400nm)"),
col =c("black"),box.lty=0,cex = 0.8)Estimating traits’ heritability
## Preparing the data:
str(DataHYP3)
DataHYP3$Name<-as.factor(DataHYP3$Name)
DataHYP3$Block<-as.factor(DataHYP3$Block)
Trait<-c("Trait_1",paste0("`",colnames(DataHYP3)[-(1:9)],"`",sep="")) - Using mixed models to evaluate the data and estimate the heritability
## Using the package "lme4" for mixed modeling:
H2<-NULL
for(t in 1:length(Trait)){
mod<-lmer(eval(parse(text = paste(Trait[t],"~Block+(1|Name)",sep=""))),data = DataHYP3)
H2<-c(H2,c(as.data.frame(VarCorr(mod))$vcov[1]/sum(as.data.frame(VarCorr(mod))$vcov)))
}
Trait_1.H2<-H2[1]
Trait_1.H2 # H2=81%
## Plotting heritability values:
plot(x=as.numeric(colnames(DataHYP2)),y=as.numeric(H2[-1]),type = "o",
xlab = "Wavelength (nm)",ylab = "H2",
col="black",pch=19,cex.lab=1.2,
ylim=c(0.1,0.9),
cex=0.5)
abline(h=Trait_1.H2,col="blue",lty=2,lwd=3)
text(x = 1300, y = (Trait_1.H2+0.03), paste("Trait_1 (H2=",round(Trait_1.H2,2),")",sep=""),col="blue")
legend(list(x = 1600,y = (Trait_1.H2-0.03)),
c("Blue (400-480nm)",
"Green (480-600nm)",
"Red (600-680nm)",
"RedEdge (680-750nm)",
"NIR (750-1300nm)",
"SWIR-1 (1500-1800nm)",
"SWIR-2 (2000-2400nm)"),
col =c("black"),box.lty=0,cex = 0.8)2) Example 02
Evaluating disease damage (Tomato Breeding)
Fernando Piotto and Jéssica Nogueira (ESALQ/USP)
- Donwload:
EX2.zip - Pictures of 10 genotypes with Xanthomonas perforans
## Vegetation indices:
index<- c("BI","GLI","NGRDI")
## Choose one image to prepare the pipeline:
EX.L1<-stack(paste("./EX2/",pics[1],sep = ""))
plotRGB(EX.L1)## Shapefile with extent=T (The whole image area will be the shapefile):
EX.L.Shape<-fieldPolygon(mosaic=EX.L1, extent=T) ## Select one index to identify leaves and remove the background:
EX.L2<-fieldMask(mosaic=EX.L1, index="BGI", cropValue=0.8, cropAbove=T) ## Select one index to identify demaged area in the leaves:
EX.L3<-fieldMask(mosaic=EX.L2$newMosaic, index="VARI", cropValue=0.1, cropAbove=T) # Making a new stack raster with new layers (demage area and indices)
EX.L5<-stack(EX.L4[[index]],(1-EX.L2$mask),(1-EX.L3$mask))
names(EX.L5)<-c(index,"Area","Demage")
plot(EX.L5)## projection=F (Ignore projection. Normally used only with remote sensing images):
EX.L.Info<- fieldInfo(mosaic=EX.L5,
fieldShape=EX.L.Shape$fieldShape,
projection=F)
## Combine information from all images in one table:
EX.L.Info$plotValue
* Parallel
## Installing:
# install.packages("foreach")
# install.packages("parallel")
# install.packages("doParallel")
## Necessary packages:
library(foreach)
library(parallel)
library(doParallel)
## Number of cores:
n.core<-3
## Starting parallel:
cl <- makeCluster(n.core, output = "")
registerDoParallel(cl)
system.time({
EX.Table <- foreach(i = 1:length(pics), .packages = c("raster","FIELDimageR"),
.combine = rbind) %dopar% {
EX.L1<-stack(paste("./EX2/",pics[i],sep = ""))
EX.L.Shape<-fieldPolygon(mosaic=EX.L1, extent=T, plot=F)
EX.L2<-fieldMask(mosaic=EX.L1, index="BGI", cropValue=0.8, cropAbove=T, plot=F)
EX.L3<-fieldMask(mosaic=EX.L2$newMosaic, index="VARI", cropValue=0.1, cropAbove=T, plot=F)
EX.L4<-fieldIndex(mosaic=EX.L2$newMosaic, index=index, plot=F)
EX.L5<-stack(EX.L4[[index]],(1-EX.L2$mask),(1-EX.L3$mask))
names(EX.L5)<-c(index,"Area","Demage")
EX.L.Info<- fieldInfo(mosaic=EX.L5, fieldShape=EX.L.Shape$fieldShape, projection=F)
EX.L.Info$plotValue
}})
EX.Table.2<-data.frame(Genotype=do.call(rbind,strsplit(pics,split = ".jpeg")),EX.Table[,-1])
EX.Table.2
- Dataset: Visual Score
- Horsfall-Barratt scale
- 4 Evaluators
- (Download: ‘EX2_Data.txt’)
## Field data:
DataEX2<-read.table("EX2_Data.txt",header = T)
DataEX2$Score_HB<-as.numeric(as.character(DataEX2$Score_HB))
DataEX2$Person<-as.factor(DataEX2$Person)
DataEX2$Genotype<-as.factor(DataEX2$Genotype)
DataEX2ggplot(DataEX2,aes(x=Genotype,y=Score_HB,fill=Person))+
facet_wrap(~Genotype,scales = "free_x",nrow = 2)+
geom_bar(stat="identity",position=position_dodge())+
scale_fill_grey()+
theme_bw()+
theme(axis.text.x=element_blank(),
axis.ticks.x=element_blank())## Regression:
DataEX2.1<-ddply(DataEX2,~Genotype,summarise,Score_HB=mean(Score_HB))
DataEX2.2<-merge(DataEX2.1,EX.Table.2,by="Genotype")
DataEX2.2DataEX2.3<-melt(DataEX2.2,
value.name = "Index",
measure.vars = c("Demage","BI","GLI","NGRDI"))
DataEX2.3$Score_HB<-as.numeric(DataEX2.3$Score_HB)
DataEX2.3$Index<-as.numeric(DataEX2.3$Index)
DataEX2.3$variable<-as.factor(DataEX2.3$variable)
DataEX2.3ggplot(DataEX2.3,aes(x=Index,y=Score_HB,col=variable))+
facet_wrap(~variable,scales = "free_x")+
geom_point() +
geom_smooth(method=lm)+
theme_bw()Reference
Matias, F.I. (2019). FIELDimageR Pipeline (tutorial). https://github.com/filipematias23/FIELDimageR
Matias, F.I.; Caraza-Harter, M.; Endelman, J.B. (2020). FIELDimageR: A R Package to Analyze Orthomosaic Images from Agricultural Field Trials. The Plant Phenome Journal. DOI:10.1002/ppj2.20005